Towards Scalable Parallelism in Monte Carlo Particle Transport Codes Using Remote Memory Access
Abstract
One forthcoming challenge in the area of high-performance computing is having the ability to run large-scale problems while coping with less memory per compute node. In this work, we investigate a novel data decomposition method that would allow Monte Carlo transport calculations to be performed on systems with limited memory per compute node. In this method, each compute node remotely retrieves a small set of geometry and cross-section data as needed and remotely accumulates local tallies when crossing the boundary of the local spatial domain. Initial results demonstrate that while the method does allow large problems to be run in a memory-limited environment, achieving scalability may be difficult due to inefficiencies in the current implementation of RMA operations.
Similar resources
Memory Bottlenecks and Memory Contention in Multi-Core Monte Carlo Transport Codes
Current and next generation processor designs require exploiting on-chip, fine-grained parallelism to achieve a significant fraction of theoretical peak CPU speed. The success or failure of these designs will have a tremendous impact on the performance and scaling of a number of key reactor physics algorithms run on next-generation computer architectures. One key example is the Monte Carlo (MC)...
Comparison of MCNP4C, 4B and 4A Monte Carlo codes when calculating electron therapy depth doses
ABSTRACT Background: accurate methods of radiation therapy dose calculation. There are different Monte Carlo codes for simulation of photons, electrons and the coupled transport of electrons and photons. MCNP is a general purpose Monte Carlo code that can be used for electron, photon and coupled photon-electron transport. Monte Carlo simulation of radiation transport is considered to be one of the ...
The effect of load imbalances on the performance of Monte Carlo algorithms in LWR analysis
A model is developed to predict the impact of particle load imbalances on the performance of domain-decomposed Monte Carlo neutron transport algorithms. Expressions for upper bound performance "penalties" are derived in terms of simple machine characteristics, material characterizations and initial particle distributions. The hope is that these relations can be used to evaluate tradeoffs among ...
The energy band memory server algorithm for parallel Monte Carlo transport calculations
An algorithm is developed to significantly reduce the on-node footprint of cross section memory in Monte Carlo particle tracking algorithms. The classic method of per-node replication of cross section data is replaced by a memory server model, in which the read-only lookup tables reside on a remote set of disjoint processors. The main particle tracking algorithm is then modified in such a way a...
Adaptive Runtime Support for Direct Simulation Monte Carlo Methods
In highly adaptive irregular problems such as many Particle-In-Cell (PIC) codes and Direct Simulation Monte Carlo (DSMC) codes, data access patterns may vary from time step to time step. This fluctuation may hinder efficient utilization of distributed memory parallel computers because of the resulting overhead for data redistribution and dynamic load balancing. To efficiently parallelize such adap-ti...
Publication date: 2011